Smart Paper Notes

Using Under-trained Deep Ensembles to Learn Under Extreme Label Noise

Konstantinos Nikolaidis , Thomas Plagemann , Stein Kristiansen , Vera Goebel , Mohan Kankanhalli

Categories: Machine Learning | Machine Learning (Statistics)

2020-09-23

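Improper or erroneous labels can hinder reliable generalization in supervised learning. This can have negative consequences, especially in critical fields such as healthcare. We propose an effective new approach for learning under extreme label noise, based on under-trained deep ensembles. Each ensemble member is trained on a subset of the training data to acquire a general overview of the decision boundary separation, without focusing on potentially erroneous details. The accumulated knowledge of the ensemble is combined to form new labels that determine a better class separation than the original labels. A new model is then trained with these labels so that it generalizes reliably despite the label noise. We focus on a healthcare setting and extensively evaluate our approach on the task of sleep apnea detection. For comparison with related work, we also evaluate on the task of digit recognition. In our experiments, we observed improvements in accuracy from 6.7% up to 49.3% for the task of digit classification, as well as improvements in kappa for the task of sleep apnea detection.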
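The relabeling idea in this abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: a nearest-centroid classifier stands in for the under-trained deep networks, the data are synthetic two-class blobs, and the function names (`fit_centroids`, `ensemble_relabel`, `make_noisy_blobs`) are hypothetical rather than taken from the paper.

```python
import numpy as np

def fit_centroids(X, y, n_classes):
    """Stand-in weak learner: one mean vector per class (not the paper's DNN)."""
    overall = X.mean(axis=0)  # fallback if a class is missing from the subset
    return np.stack([
        X[y == c].mean(axis=0) if np.any(y == c) else overall
        for c in range(n_classes)
    ])

def predict_centroids(centroids, X):
    # Squared distance from every sample to every class centroid.
    d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def ensemble_relabel(X, y_noisy, n_classes, n_members=15, frac=0.3, seed=0):
    """Train each member on a random subset of the noisy data, then
    replace every label with the ensemble's majority vote."""
    rng = np.random.default_rng(seed)
    votes = np.zeros((len(X), n_classes), dtype=int)
    for _ in range(n_members):
        idx = rng.choice(len(X), size=int(frac * len(X)), replace=False)
        centroids = fit_centroids(X[idx], y_noisy[idx], n_classes)
        preds = predict_centroids(centroids, X)
        votes[np.arange(len(X)), preds] += 1
    return votes.argmax(axis=1)

def make_noisy_blobs(n=1000, noise=0.4, seed=1):
    """Synthetic data (assumption): two Gaussian blobs with flipped labels."""
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, size=n)
    shift = np.where(y[:, None] == 1, 2.0, -2.0) * np.array([1.0, 0.0])
    X = rng.normal(size=(n, 2)) + shift
    flip = rng.random(n) < noise
    return X, y, np.where(flip, 1 - y, y)

X, y_clean, y_noisy = make_noisy_blobs()
y_new = ensemble_relabel(X, y_noisy, n_classes=2)
print("agreement of noisy labels with clean labels:", (y_noisy == y_clean).mean())
print("agreement of ensemble labels with clean labels:", (y_new == y_clean).mean())
```

In this toy setting the voted labels would then be used to train a fresh model, which is the final step the abstract describes.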


Learning Realistic Patterns from Unrealistic Stimuli: Generalization and Data Anonymization

Konstantinos Nikolaidis , Stein Kristiansen , Thomas Plagemann , Vera Goebel , Knut Liestøl , Mohan Kankanhalli , Gunn Marit Traaen , Britt Øverland , Harriet Akre , Lars Aakerøy

Categories: Machine Learning | Machine Learning (Statistics)

2020-09-21

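Good training data is a prerequisite for developing useful ML applications. However, in many domains existing datasets cannot be shared due to privacy regulations (e.g., data from medical studies). This work investigates a simple yet unconventional approach to anonymized data synthesis that enables third parties to benefit from such private data. We explore the feasibility of learning implicitly from unrealistic, task-relevant stimuli, which are synthesized by exciting the neurons of a trained deep neural network (DNN). Neuronal excitation thus serves as a pseudo-generative model. The stimuli data is used to train new classification models. Furthermore, we extend this framework to inhibit representations that are associated with specific individuals. We use sleep monitoring data from both an open and a large closed clinical study, and evaluate whether (1) end users can create and successfully use customized classification models for sleep apnea detection, and (2) the identity of participants in the study is protected. An extensive comparative empirical investigation shows that different algorithms trained on the stimuli are able to generalize successfully on the same task as the original model. However, architectural and algorithmic similarity between the new and original models plays an important role in performance. For similar architectures, the performance is close to that obtained with the true data (e.g., an accuracy difference of 0.56% and a Kappa coefficient difference of 0.03-0.04). Further experiments show that the stimuli can, to a large extent, successfully anonymize participants of the clinical studies.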


3D Neuron Morphology Analysis

Jiaxiang Jiang , Michael Goebel , Cezar Borba , William Smith , B. S. Manjunath

Categories: Computer Vision

2022-12-14

We consider the problem of finding an accurate representation of neuron shapes, extracting sub-cellular features, and classifying neurons based on their shapes. In neuroscience research, the skeleton representation is often used as a compact and abstract representation of neuron shapes. However, existing methods are limited to obtaining and analyzing "curve" skeletons, which can only be applied to tubular shapes. This paper presents a 3D neuron morphology analysis method for more general and complex neuron shapes. First, we introduce the concept of a skeleton mesh to represent general neuron shapes and propose a novel method for computing mesh representations from 3D surface point clouds. A skeleton graph is then obtained from the skeleton mesh and is used to extract sub-cellular features. Finally, an unsupervised learning method is used to embed the skeleton graph for neuron classification. Extensive experimental results demonstrate the robustness of our method for analyzing neuron morphology.


Design Space Exploration and Explanation via Conditional Variational Autoencoders in Meta-model-based Conceptual Design of Pedestrian Bridges

Vera M. Balmer , Sophia V. Kuhn , Rafael Bischof , Luis Salamanca , Walter Kaufmann , Fernando Perez-Cruz , Michael A. Kraus

Categories: Machine Learning

2022-11-29

For conceptual design, engineers rely on conventional iterative (often manual) techniques. Emerging parametric models facilitate design space exploration based on quantifiable performance metrics, yet remain time-consuming and computationally expensive. Pure optimisation methods, however, ignore qualitative aspects (e.g. aesthetics or construction methods). This paper provides a performance-driven design exploration framework to augment the human designer through a Conditional Variational Autoencoder (CVAE), which serves as a forward performance predictor for given design features as well as an inverse design feature predictor conditioned on a set of performance requests. The CVAE is trained on 18'000 synthetically generated instances of a pedestrian bridge in Switzerland. Sensitivity analysis is employed for explainability and for informing designers about (i) the relations the model captures between features and/or performances and (ii) structural improvements under user-defined objectives. A case study demonstrated our framework's potential to serve as a future co-pilot for conceptual design studies of pedestrian bridges and beyond.


Employing Graph Representations for Cell-level Characterization of Melanoma MELC Samples

Luis Carlos Rivera Monroy , Leonhard Rist , Martin Eberhardt , Christian Ostalecki , Andreas Baur , Julio Vera , Katharina Breininger , Andreas Maier

Categories: Computer Vision | Artificial Intelligence

2022-11-10

Histopathology imaging is crucial for the diagnosis and treatment of skin diseases. For this reason, computer-assisted approaches have gained popularity and shown promising results in tasks such as segmentation and classification of skin disorders. However, collecting essential data and sufficiently high-quality annotations is a challenge. This work describes a pipeline that uses suspected melanoma samples that have been characterized using Multi-Epitope-Ligand Cartography (MELC). This cellular-level tissue characterisation is then represented as a graph and used to train a graph neural network. This imaging technology, combined with the methodology proposed in this work, achieves a classification accuracy of 87%, outperforming existing approaches by 10%.


Depression Symptoms Modelling from Social Media Text: An Active Learning Approach

Nawshad Farruque , Randy Goebel , Sudhakar Sivapalan , Osmar Zaiane

Categories: Natural Language Processing | Artificial Intelligence | Machine Learning

2022-09-06

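A fundamental component of clinical depression modelling based on social media language is depression symptoms detection (DSD). Unfortunately, no existing DSD dataset reflects both the clinical insights and the distribution of depression symptoms found in samples from the self-disclosed depressed population. In our work, we describe an active learning (AL) framework that uses an initial supervised learning model which leverages 1) a state-of-the-art language model pre-trained on large mental health forum text and further fine-tuned on a clinician-annotated DSD dataset, and 2) a zero-shot learning model for DSD, and couples them together to harvest depression-symptom-related samples from our large self-curated Depression Tweets Repository (DTR). Our clinician-annotated dataset is the largest of its kind. Furthermore, DTR is created from the Twitter timelines of self-disclosed depressed users in two datasets, including one of the largest benchmark datasets for user-level depression detection from Twitter. This further helps preserve the depression symptom distribution of tweets from self-disclosed Twitter users. Subsequently, we iteratively retrain our initial DSD model with the harvested data. We discuss the stopping criteria and limitations of this AL process, and elaborate on the underlying constructs that play a vital role in the overall AL process. We show that we can produce a final dataset that is the largest of its kind. Furthermore, the DSD and depression detection (DPD) models trained on it achieve significantly better accuracy than their initial versions.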


Exploring Popularity Bias in Music Recommendation Models and Commercial Streaming Services

Douglas R. Turnbull , Sean McQuillan , Vera Crabtree , John Hunter , Sunny Zhang

Categories: Machine Learning

2022-08-19

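Popularity bias is the idea that a recommender system will unduly favor popular artists when recommending artists to users. As such, these systems may contribute to a winner-take-all marketplace in which a small number of artists receive nearly all of the attention, while similarly meritorious artists are unlikely to be discovered. In this paper, we attempt to measure popularity bias in three state-of-the-art recommender system models (SLIM, Multi-VAE, WRMF) and on three commercial music streaming services (Spotify, Amazon Music, YouTube). We find that the most accurate model (SLIM) also has the most popularity bias, while less accurate models have less popularity bias. We also find no evidence of popularity bias in the commercial recommendations, based on a simulated user experiment.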


Deep Learning Enabled Time-Lapse 3D Cell Analysis

Jiaxiang Jiang , Amil Khan , S. Shailja , Samuel A. Belteton , Michael Goebel , Daniel B. Szymanski , B. S. Manjunath

Categories: Computer Vision

2022-08-17

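This paper presents a method for time-lapse 3D cell analysis. Specifically, we consider the problems of accurately localizing and quantitatively analyzing sub-cellular features, and of tracking individual cells from time-lapse 3D confocal cell image stacks. The heterogeneity of cells and the volume of multi-dimensional images present a major challenge for fully automated analysis of cell morphogenesis and development. This paper is motivated by the pavement cell growth process and by the goal of building a quantitative morphogenesis model. We propose a deep-feature-based segmentation method to accurately detect and label each cell region. An adjacency-graph-based method is used to extract sub-cellular features of the segmented cells. Finally, a robust graph-based tracking algorithm using multiple cell features is proposed for associating cells across different time instances. Extensive experimental results demonstrate the robustness of the proposed method. The code is available on GitHub, and the method is available as a service through the BisQue portal.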


Towards the Use of Saliency Maps for Explaining Low-Quality Electrocardiograms to End Users

Ana Lucic , Sheeraz Ahmad , Amanda Furtado Brinhosa , Vera Liao , Himani Agrawal , Umang Bhatt , Krishnaram Kenthapadi , Alice Xiang , Maarten de Rijke , Nicholas Drabowski

Categories: Machine Learning | Artificial Intelligence

2022-07-06

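When medical images are used for diagnosis, whether by clinicians or by artificial intelligence (AI) systems, it is important that the images are of high quality. When an image is of low quality, the medical exam that produced it often needs to be redone. In telemedicine, a common problem is that the quality issue is only flagged after the patient has left the clinic, meaning they must return to have the exam redone. This can be especially difficult for people living in remote regions, who make up a substantial portion of the patients at Portemedicina, a digital healthcare organization in Brazil. In this paper, we report on ongoing work regarding (i) the development of an AI system for flagging and explaining low-quality medical images in real time, (ii) an interview study to understand the explanation needs of stakeholders using the AI system at OurCompany, and (iii) a longitudinal user study design to examine the effect of including explanations on the workflow of the technicians in our clinics. To the best of our knowledge, this would be the first longitudinal study evaluating the effects of XAI methods on end users, that is, stakeholders who use AI systems but do not have AI-specific expertise. We welcome feedback and suggestions on our experimental setup.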


Flexible text generation for counterfactual fairness probing

Zee Fryer , Vera Axelrod , Ben Packer , Alex Beutel , Jilin Chen , Kellie Webster

Categories: Natural Language Processing

2022-06-28

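A common approach to testing for fairness issues in text-based classifiers is through the use of counterfactuals: does the classifier output change if a sensitive attribute in the input is changed? Existing counterfactual generation methods typically rely on wordlists or templates, producing simple counterfactuals that do not take into account grammar, context, or subtle sensitive attribute references, and may miss issues that the wordlist creators had not considered. In this paper, we introduce a task for generating counterfactuals that overcomes these shortcomings, and demonstrate how large language models (LLMs) can be leveraged to make progress on this task. We show that this LLM-based method can produce complex counterfactuals that existing methods cannot, comparing the performance of various counterfactual generation methods on the Civil Comments dataset and showing their value in evaluating a toxicity classifier.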
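For orientation, the wordlist-style baseline that this abstract contrasts against can be sketched as follows. This is not the paper's LLM-based method. The swap list `SWAPS`, the helper names, and the idea of passing in any scoring callable are all assumptions made for illustration.

```python
import re

# Hypothetical wordlist of sensitive-attribute terms (illustrative only).
# Mapping "her" to "his" is a guess: "her" can also mean "him", which is
# exactly the kind of grammatical ambiguity a plain wordlist cannot resolve.
SWAPS = {"he": "she", "she": "he", "his": "her", "her": "his",
         "him": "her", "man": "woman", "woman": "man"}

def wordlist_counterfactual(text, swaps):
    """Replace each whole word found in `swaps`, keeping initial capitalization."""
    def replace(match):
        word = match.group(0)
        target = swaps.get(word.lower())
        if target is None:
            return word
        return target.capitalize() if word[0].isupper() else target
    return re.sub(r"\b\w+\b", replace, text)

def counterfactual_gap(classifier, text, swaps):
    """Absolute change in the classifier score between a text and its counterfactual."""
    return abs(classifier(text) - classifier(wordlist_counterfactual(text, swaps)))

# Usage with a deliberately biased toy scorer (hypothetical, for demonstration).
biased_scorer = lambda t: 1.0 if "she" in t.lower().split() else 0.0
print(wordlist_counterfactual("He lost his keys.", SWAPS))
print(counterfactual_gap(biased_scorer, "he is late", SWAPS))
```

A nonzero gap flags inputs where the score depends on the swapped attribute. The limitations the abstract lists (grammar, context, subtle references) are visible in how little this substitution understands about the sentence.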
